Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
We present PUTNAMBENCH, a new multilingual benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems. PUTNAMBENCH consists of 1697 hand-constructed formalizations of 640 theorems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America. All the theorems have formalizations in Lean 4 and Isabelle; a substantial subset also has Coq formalizations. Proving the theorems requires significant problem-solving ability and proficiency in a broad range of topics taught in undergraduate mathematics courses. We use PUTNAMBENCH to evaluate several established neural and symbolic theorem-provers. These approaches can only solve a handful of the PUTNAMBENCH problems, establishing the benchmark as a difficult open challenge for research on neural theorem-proving. PUTNAMBENCH is available at https://github.com/trishullab/PutnamBench.more » « lessFree, publicly-accessible full text available December 7, 2025
-
Free, publicly-accessible full text available December 1, 2025
-
We present PutnamBench, a new multi-language benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems. PutnamBench consists of 1692 hand-constructed formalizations of 640 theorems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America. All the problems have formalizations in Lean 4 and Isabelle; a substantial subset also has Coq formalizations. PutnamBench requires significant problem-solving ability and proficiency in a broad range of topics taught in undergraduate mathematics courses. We use PutnamBench to evaluate several established neural and symbolic theorem-provers. These approaches can only solve a handful of the PutnamBench problems, establishing the benchmark as a difficult open challenge for research on neural theorem-proving. PutnamBench is available at https://github.com/trishullab/PutnamBench.more » « lessFree, publicly-accessible full text available December 1, 2025
An official website of the United States government

Full Text Available